287 research outputs found

    Data Handover: Reconciling Message Passing and Shared Memory

    Data Handover (DHO) is a programming paradigm and interface for handling data between parallel or distributed processes that mixes aspects of message passing and shared memory. It is designed to overcome the potential efficiency problems of both: (1) memory blowup and forced copies for message passing and (2) data consistency and latency problems for shared memory. Our approach attempts to be simple and easy to understand. It contents itself with just a handful of functions to cover the main aspects of coarse-grained inter-operation upon data.

    Efficient union-find for planar graphs and other sparse graph classes

    We solve the Union-Find Problem (UF) efficiently for the case that the input is restricted to several graph classes, namely partial k-trees for any fixed k, d-dimensional grids for any fixed dimension d, and planar graphs. The result on grids allows us to perform region growing techniques that are used for image segmentation in linear time. For planar graphs we develop a technique of decomposing such a graph into small subgraphs, patching, that might be useful for other algorithmic problems on planar graphs, too. By efficiency we do not only mean linear time in a theoretical setting but also a practical reorganization of memory such that a dynamic data structure for UF is allocated consecutively.
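To make the memory-layout remark in the abstract above concrete, here is a minimal sketch of a generic array-based union-find whose bookkeeping lives in one consecutive allocation. It is not the paper's specialized linear-time structure: it uses the textbook union-by-size and path-halving heuristics, and the names `uf_t`, `uf_init`, `uf_find`, `uf_union` and `uf_free` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical illustration: a textbook union-find, not the paper's
   linear-time variant for grids, partial k-trees or planar graphs. */
typedef struct {
    size_t n;        /* number of elements */
    size_t *parent;  /* parent[i] == i marks a root */
    size_t *size;    /* size[r] is meaningful only for roots r */
} uf_t;

/* Allocate both arrays in a single consecutive block.
   Returns 0 on success, -1 if the allocation fails.
   (Overflow of 2 * n is not checked in this sketch.) */
int uf_init(uf_t *uf, size_t n) {
    size_t *block = malloc(2 * n * sizeof *block);
    if (!block) return -1;
    uf->n = n;
    uf->parent = block;
    uf->size = block + n;
    for (size_t i = 0; i < n; ++i) {
        uf->parent[i] = i;
        uf->size[i] = 1;
    }
    return 0;
}

void uf_free(uf_t *uf) {
    free(uf->parent);  /* parent points at the start of the block */
    uf->parent = NULL;
    uf->size = NULL;
    uf->n = 0;
}

/* Find the representative of x, shortening the path by halving. */
size_t uf_find(uf_t *uf, size_t x) {
    assert(x < uf->n);
    while (uf->parent[x] != x) {
        uf->parent[x] = uf->parent[uf->parent[x]];
        x = uf->parent[x];
    }
    return x;
}

/* Merge the sets of a and b; returns 1 if they were distinct, else 0. */
int uf_union(uf_t *uf, size_t a, size_t b) {
    size_t ra = uf_find(uf, a);
    size_t rb = uf_find(uf, b);
    if (ra == rb) return 0;
    if (uf->size[ra] < uf->size[rb]) {
        size_t t = ra; ra = rb; rb = t;
    }
    uf->parent[rb] = ra;          /* attach the smaller tree */
    uf->size[ra] += uf->size[rb];
    return 1;
}
```

For region growing on an image, one plausible mapping (again an assumption) is to index pixel (row, col) as `row * width + col` and call `uf_union` for neighbouring pixels that belong to the same region.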

    An experimental validation of the PRO model for parallel and distributed computation

    The Parallel Resource-Optimal (PRO) computation model was introduced by Gebremedhin et al. [2002] as a framework for the design and analysis of efficient parallel algorithms. The key features of the PRO model that distinguish it from previous parallel computation models are the full integration of resource-optimality into the design process and the use of a granularity function as a parameter for measuring quality. In this paper we present experimental results on parallel algorithms, designed using the PRO model, for two representative problems: list ranking and sorting. The algorithms are implemented using SSCRAP, our environment for developing coarse-grained algorithms. The experimental performance results observed agree well with analytical predictions using the PRO model. Moreover, by using different platforms to run our experiments, we have been able to provide an integrated view of the modeling of an underlying architecture and the design and implementation of scalable parallel algorithms.

    Revise spelling of keywords: proposal for C23

    Over time C has integrated some new features as keywords (some genuine, some from C++), but the naming strategy has not been entirely consistent: some were integrated using non-reserved names (const, inline), others were integrated in an underscore-capitalized form. For some of them, the use of the lower-case form is then ensured via a set of library header files. The reason for this complicated mechanism had been backwards compatibility for existing code bases. Since years or even decades have now gone by, we think that it is time to switch and to use the primary spelling. This is a revision of papers N2368 and N2392 where we reduce the focus to the list of keywords that found consensus in the WG14 London 2019 meeting. Other papers will build on this for those keywords or features that need more investigation.
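The header mechanism described above can be illustrated with a short sketch. Before C23, the lower-case spellings used here are macros supplied by `<stdbool.h>`, `<stdalign.h>` and `<assert.h>` on top of `_Bool`, `_Alignas` and `_Static_assert`. Keeping the includes lets the same source compile in both older and newer language modes. The names `scratch` and `is_aligned` are hypothetical.

```c
#include <assert.h>    /* pre-C23: static_assert as a macro for _Static_assert */
#include <stdalign.h>  /* pre-C23: alignas as a macro for _Alignas */
#include <stdbool.h>   /* pre-C23: bool, true, false as macros */
#include <stddef.h>
#include <stdint.h>

/* Lower-case spelling of a compile-time assertion. */
static_assert(sizeof(long long) >= sizeof(int),
              "long long is at least as wide as int");

/* Lower-case spelling of an alignment specifier. */
alignas(16) unsigned char scratch[64];

/* Lower-case spelling of the Boolean type. The pointer-to-integer
   conversion is implementation-defined; the modulo test assumes the
   usual flat address representation. */
bool is_aligned(const void *p, size_t alignment) {
    return ((uintptr_t)p % alignment) == 0;
}
```

Under a compiler in C23 mode the three headers contribute little or nothing for these names, since the lower-case forms are keywords there.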

    Add annotations for unreachable control flow

    We propose the feature unreachable to specify branches in the control flow of a program that will never be reached. The aim is to provide means for the user to express guarantees about the effective control flow that will be executed by a program. Compilers may then apply aggressive optimizations that otherwise would not be possible or that would rely on the detection of undefined behavior for certain input combinations.
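As a rough sketch of how such an annotation is used: C23 as published provides a function-like macro `unreachable()` in `<stddef.h>`; whether that matches every detail of this proposal is an assumption here. The fallback definitions below are only for compilers without that macro and rely on the GCC/Clang builtin `__builtin_unreachable()` or MSVC's `__assume(0)`. The enumeration `dir_t` and the function `dir_dx` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>   /* C23: defines the unreachable() macro */
#include <stdlib.h>

/* Fallbacks for pre-C23 compilers (assumption: one of these applies). */
#ifndef unreachable
# if defined(__GNUC__) || defined(__clang__)
#  define unreachable() __builtin_unreachable()
# elif defined(_MSC_VER)
#  define unreachable() __assume(0)
# else
#  define unreachable() abort()  /* safe but gives no optimization hint */
# endif
#endif

typedef enum { DIR_NORTH, DIR_EAST, DIR_SOUTH, DIR_WEST } dir_t;

/* Horizontal step for a direction. The caller guarantees that d is one
   of the four enumerators, so control never leaves the switch. */
int dir_dx(dir_t d) {
    /* Debug builds check the promise; with NDEBUG the check disappears. */
    assert(d >= DIR_NORTH && d <= DIR_WEST);
    switch (d) {
    case DIR_NORTH: return 0;
    case DIR_EAST:  return 1;
    case DIR_SOUTH: return 0;
    case DIR_WEST:  return -1;
    }
    unreachable();  /* reaching this point would be undefined behavior */
}
```

The annotation is a promise by the programmer, not a runtime check: passing a value outside the enumeration breaks the promise and the behavior is then undefined.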

    Randomized Permutations in a Coarse Grained Parallel Environment [extended abstract]

    We show how to uniformly distribute data at random (not to be confused with permutation routing) in a coarse grained parallel environment with p processors. In contrast to previously known work, our method is able to fulfill the three criteria of uniformity, work-optimality and balance among the processors simultaneously. To guarantee the uniformity we investigate the matrix of communication requests between the processors. We show that its distribution is a generalization of the multivariate hypergeometric distribution and we give algorithms to compute it efficiently.
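To give a feel for the distribution mentioned above, the following sketch samples a single row of a communication matrix: how many of one processor's k items land on each of p targets when the positions are chosen uniformly without replacement. This covers only one row under the plain multivariate hypergeometric distribution and is not the paper's algorithm for the full matrix. The function names are hypothetical, and `rand()` with a modulo is slightly biased, so this is for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Draw k items without replacement from an urn of `total` items, of
   which `marked` are marked; return how many marked items were drawn.
   Requires marked <= total and k <= total. */
size_t draw_hypergeometric(size_t total, size_t marked, size_t k) {
    assert(marked <= total && k <= total);
    size_t hits = 0;
    for (size_t i = 0; i < k; ++i) {
        /* The next item is marked with probability marked / total. */
        if ((size_t)rand() % total < marked) {
            ++hits;
            --marked;
        }
        --total;
    }
    return hits;
}

/* Fill row[0..p-1] with the number of the k items sent to each target,
   where target j can receive capacity[j] items. Each entry is drawn
   conditionally on the entries already fixed. */
void sample_row(size_t p, const size_t capacity[], size_t k, size_t row[]) {
    size_t total = 0;
    for (size_t j = 0; j < p; ++j) total += capacity[j];
    assert(k <= total);
    for (size_t j = 0; j < p; ++j) {
        row[j] = draw_hypergeometric(total, capacity[j], k);
        total -= capacity[j];  /* positions of target j are now settled */
        k -= row[j];           /* items still to be placed */
    }
}
```

By construction the entries of `row` sum to the original k and no entry exceeds its target's capacity.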

    Futex-based locks for C11's generic atomic operations

    We present a new algorithm and implementation of a lock primitive that is based on Linux' native lock interface, the futex system call. It allows us to assemble compiler support for atomic data structures that cannot be handled through specific hardware instructions. Such a tool is needed for C11's atomics interface because here an _Atomic qualification can be attached to almost any data type. Our lock data structure for that purpose meets very specific criteria concerning its field of operation and its performance. By that we are able to outperform gcc's libatomic library by around 60%.

    Make call_once mandatory

    Basic lambdas for C: proposal for C23

    We propose the inclusion of simple lambda expressions into the C standard. We build on a slightly restricted syntax of that feature in C++. In particular, they only have immutable value captures, fully specified parameter types, and, based on N2891, the return type is inferred from return statements. This is part of a series of papers for the improvement of type-generic programming in C for which the rationale is given in N2890. Follow-up papers N2894 and N2893 extend this feature with auto parameter types and default capture strategies, respectively.